(54) Improved optimization techniques for data compression



(57) Methods and apparatuses are provided relating to the encoding of data, such as, e.g., images, video,
etc. For example, certain methods include processing at least a portion of data using a plurality of different
quantization functions to produce a plurality of corresponding quantized portions of data, and selectively
outputting one of the quantized portions of data based on at least one threshold value. The method may also
include dividing initial data into a plurality of portions and classifying the portion of data based on at least
one classification characteristic. Here, for example, there may be a threshold value that is associated with the
classification characteristic. Additional syntax may be adopted to enable considerably higher compression
efficiency by allowing several alternative motion prediction cases. A high efficiency time stamp independent
Direct Mode is also provided which considers spatial motion vector prediction along with stationary temporal
predictors.
Description

RELATED PATENT APPLICATIONS 



[0001] This U.S. Patent Application claims the benefit of priority from, and hereby incorporates by reference the
entire disclosure of, co-pending U.S. Provisional Patent Application Serial No. 60/393,894, filed July 5, 2002, and titled
"Rate/Distortion Optimization Techniques for Image and Video Compression".



TECHNICAL FIELD 



[0002] The present invention relates generally to computers and like devices, and more particularly to methods,
apparatuses and systems for compressing/encoding data and decompressing/decoding data. 



BACKGROUND 



[0003] There is a continuing need for improved methods and apparatuses for compressing/encoding data and
decompressing/decoding data, and in particular image and video data. Improvements in coding efficiency allow for more
information to be processed, transmitted and/or stored more easily by computers and other like devices. With the
increasing popularity of the Internet and other like computer networks, and wireless communication systems, there is
a desire to provide highly efficient coding techniques to make full use of available resources.

[0004] Rate Distortion Optimization (RDO) techniques are quite popular in video and image encoding/decoding sys- 
tems since they can considerably improve encoding efficiency compared to more conventional encoding methods. 
[0005] Additional information, for example, may be found in a Master of Science in Computer Science and Engineering
thesis titled "A Locally Adaptive Perceptual Masking Threshold Model for Image Coding", by Trac Duy Tran while
at the Massachusetts Institute of Technology, May 1994.

[0006] As there is a continuing desire to provide even more encoding efficiency, there is a need for improved methods 
and apparatuses that further increase the performance of RDO or other like techniques to achieve improved coding 
efficiency versus existing systems. 

SUMMARY

[0007] The present invention provides improved methods and apparatuses that can be used in compressing/encod- 
ing, decompressing/decoding data, and/or otherwise processing various forms of data including, but not limited to 
image, video and/or audio data. 
[0008] The above-stated needs and others are met, for example, by a method that includes processing at least a
portion of data using a plurality of different quantization functions to produce a plurality of corresponding quantized
portions of data, and selectively outputting one of the quantized portions of data based on at least one threshold value.
The method may also include dividing initial data into a plurality of portions and classifying the portion of data based
on at least one classification characteristic. Here, for example, there may be a threshold value that is associated with
the classification characteristic.

[0009] By way of example, the initial data may include image data, video data, audio data, speech data, and the like. 
The portion that is selected may take the form of a block, a macroblock, a slit, a slice, a section, or the like. The
classification characteristic(s) may include an edge characteristic, a texture characteristic, a smoothness characteristic, 
a luminance characteristic, a chrominance characteristic, a color characteristic, a noise characteristic, an object char- 
acteristic, a motion characteristic, a user preference characteristic, a user interface focus characteristic, a layering 
characteristic, a timing characteristic, a volume characteristic, a frequency characteristic, a pitch characteristic, a tone 
characteristic, a quality characteristic, a bit rate characteristic, a data type characteristic, a resolution characteristic, 
an encryption characteristic, or the like. 

[0010] In certain exemplary implementations, the plurality of different quantization functions includes at least two 
operatively different Deadzone Quantizers. Here, for example, a Deadzone Quantizer may be an adaptive coverage 
Deadzone Quantizer, a variable coverage size Deadzone Quantizer, or the like. The method may also include encoding 
the quantized portions. The method may further include performing Rate Distortion Optimization (RDO) to select the 
quantized portions of data. 

[0011] In accordance with still other exemplary implementations, another method includes performing at least one 
characteristic analysis on at least one portion of image data, selectively setting at least one adaptive quantization 
parameter within an encoder based on the characteristic analysis, and encoding the portion of image data with the 
encoder. 

[0012] In yet another exemplary implementation, a method is provided that includes causing at least one portion of
image data to be encoded using at least two different Deadzone Quantizers, and identifying preferred encoded data
in an output of one of the at least two different Deadzone Quantizers based on a Rate Distortion Optimization (RDO)
decision associated with at least one decision factor.

[0013] The above stated needs and others are also satisfied by a method that includes causing at least one portion
of image data to be encoded using a first Deadzone Quantizer, determining if an output of the first Deadzone Quantizer
satisfies at least one decision factor and, if so, outputting the output of the first Deadzone Quantizer. If not, the method
includes causing the portion of image data to be encoded using at least a second Deadzone Quantizer that is different
from the first Deadzone Quantizer. Here, for example, the method may also include identifying an acceptable encoded
version of the portion of the image data based on an RDO decision or the like.

[0014] In still another exemplary implementation, a method includes performing image analysis on at least one portion
of image data, performing block classification on the analyzed portion of image data, performing Deadzone Quantization 
of the block classified portion of image data, and performing encoding of the Deadzone Quantized portion of image 
data. Here, for example, the image analysis may include edge detection analysis, texture analysis, etc. 
[0015] In accordance with a further exemplary implementation, a method is provided that includes causing at least
one portion of video image data to be encoded using at least two different encoders, wherein at least one of the two
different encoders includes a Deadzone Quantizer operatively configured to support a Non Residual Mode of the video
image data, and identifying preferred encoded frame data in an output of one of the two different encoders based on
a Rate Distortion Optimization (RDO) decision associated with at least one decision factor.

[0016] Another exemplary method includes selectively varying at least one Lagrangian multiplier that is operatively
configuring encoding logic having a quantizing function based on at least one characteristic of at least one portion of 
image data, and encoding the portion of the image data using the encoding logic. 

[0017] In still another implementation, an exemplary method includes encoding at least a portion of video image data 
using encoder logic, and causing the encoder logic to output syntax information identifying a type of motion vector 
prediction employed by the encoder logic. 

[0018] A method for use in conveying video encoding related information includes encoding video data, and selec- 
tively setting at least one descriptor within a syntax portion of the encoded video data, the descriptor identifying an 
adaptive spatial/spatio-temporal encoding associated with at least one B frame encoded with the video data. Another 
method for use in conveying video encoding related information includes encoding video data, and selectively setting 
at least one descriptor within a syntax portion of the encoded video data, the descriptor identifying an adaptive copy/
motion-copy skip mode in at least one inter frame encoded with the video data.

[0019] An exemplary method for use in a time stamp independent mode encoding of video considering stationary 
temporal/spatial portions of video frames is provided. Here, for example, the method includes selectively applying
spatial prediction of motion associated with at least one portion of a video frame in a video sequence, and, if temporal
motion prediction information for a reference portion of another video frame is zero, then setting the spatial prediction
of motion to zero.
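
By way of illustration only, the zero-override rule of paragraph [0019] might be sketched as follows. The function name `direct_mode_motion`, the tuple representation of motion vectors, and the assumption that a spatial prediction has already been computed elsewhere are all hypothetical and are not taken from this description.

```python
def direct_mode_motion(spatial_prediction, colocated_temporal_mv):
    """Return the motion vector to use for one portion of a frame.

    spatial_prediction: (dx, dy) predicted from spatial neighbours (assumed given).
    colocated_temporal_mv: (dx, dy) of the reference portion in another frame.
    """
    # A zero temporal vector is taken to mean the portion is stationary,
    # so the spatial prediction is overridden with zero motion.
    if colocated_temporal_mv == (0, 0):
        return (0, 0)
    return spatial_prediction
```

In this sketch no time stamp enters the decision; only the stationarity of the reference portion is tested.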

BRIEF DESCRIPTION OF THE DRAWINGS 
[0020] A more complete understanding of the various methods, apparatuses and systems of the present invention
may be had by reference to the following detailed description when taken in conjunction with the accompanying
drawings wherein:



Fig. 1 is a block diagram that depicts an exemplary device, in the form of a computer, which is suitable for use with 
certain implementations of the present invention. 

Figs 2(a-d) are graphs depicting exemplary selectable Deadzone Quantizers, in accordance with certain imple- 
mentations of the present invention. 

Fig. 3 is a flow diagram illustrating an exemplary method for selectively applying different quantization processes 
to data, in accordance with certain implementations of the present invention. 

Fig. 4 is a block diagram depicting exemplary logic for selectively applying different quantization processes to data,
in accordance with certain implementations of the present invention.

Fig. 5 is a block diagram depicting exemplary logic for selectively applying different quantization processes to data, 
in accordance with certain further implementations of the present invention. 

Figs 6(a-b) are block diagrams depicting exemplary logic for selectively applying different quantization processes 
to data, in accordance with still other implementations of the present invention. 

Fig. 7 is a block diagram illustratively depicting exemplary logic for selectively applying different quantization proc- 
esses to image data, in accordance with certain implementations of the present invention. 

Fig. 8 is a chart listing exemplary syntax information for use with logic for selectively applying different prediction 
methods for motion vector, in accordance with certain implementations of the present invention. 
Fig. 9 is a block diagram depicting exemplary logic for selectively applying different encoding schemes to video
data, in accordance with certain implementations of the present invention. 

Fig. 10 is a block diagram depicting exemplary logic for selectively applying different encoding schemes to video
data, in accordance with certain further implementations of the present invention. 

Fig. 11 is an illustrative diagram depicting certain features of a video sequence employing selectively applied encoding
schemes, in accordance with certain implementations of the present invention.

Fig. 12 is a flow diagram depicting an exemplary method for SpatioTemporal Prediction for Direct Mode video 
sequences, in accordance with certain implementations of the present invention.

DESCRIPTION



[0021] Turning to the drawings, wherein like reference numerals refer to like elements, the invention is illustrated as
being implemented in a suitable computing environment. Although not required, the invention will be described in the
general context of computer-executable instructions, such as program modules, being executed by a server computer,
which may take the form of a personal computer, a workstation, a dedicated server, a plurality of processors, a mainframe
computer, etc. Generally, program modules include routines, programs, objects, components, data structures,
etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in
distributed computing environments where tasks are performed by remote processing devices that are linked through
a communications network. In a distributed computing environment, program modules may be located in both local
and remote memory storage devices.

Exemplary Computing Environment: 

[0022] Fig. 1 illustrates an example of a suitable computing environment 120 on which the subsequently described
methods and arrangements may be implemented.

[0023] Exemplary computing environment 120 is only one example of a suitable computing environment and is not 
intended to suggest any limitation as to the scope of use or functionality of the improved methods and arrangements 
described herein. Neither should computing environment 120 be interpreted as having any dependency or requirement
relating to any one or combination of components illustrated in computing environment 120. 

[0024] The improved methods and arrangements herein are operational with numerous other general purpose or
special purpose computing system environments or configurations. 

[0025] As shown in Fig. 1, computing environment 120 includes a general-purpose computing device in the form of
a computer 130. The components of computer 130 may include one or more processors or processing units 132, a
system memory 134, and a bus 136 that couples various system components including system memory 134 to
processor 132.

[0026] Bus 136 represents one or more of any of several types of bus structures, including a memory bus or memory
controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus
architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA)
bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association
(VESA) local bus, and Peripheral Component Interconnect (PCI) bus, also known as Mezzanine bus.
[0027] Computer 130 typically includes a variety of computer readable media. Such media may be any available 
media that is accessible by computer 130, and it includes both volatile and non-volatile media, removable and non- 
removable media. 

[0028] In Fig. 1, system memory 134 includes computer readable media in the form of volatile memory, such as 
random access memory (RAM) 140, and/or non-volatile memory, such as read only memory (ROM) 138. A basic input/
output system (BIOS) 142, containing the basic routines that help to transfer information between elements within
computer 130, such as during start-up, is stored in ROM 138. RAM 140 typically contains data and/or program modules
that are immediately accessible to and/or presently being operated on by processor 132. 

[0029] Computer 130 may further include other removable/non-removable, volatile/non-volatile computer storage
media. For example, Fig. 1 illustrates a hard disk drive 144 for reading from and writing to a non-removable,
non-volatile magnetic media (not shown and typically called a "hard drive"), a magnetic disk drive 146 for reading from and
writing to a removable, non-volatile magnetic disk 148 (e.g., a "floppy disk"), and an optical disk drive 150 for reading
from or writing to a removable, non-volatile optical disk 152 such as a CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM
or other optical media. Hard disk drive 144, magnetic disk drive 146 and optical disk drive 150 are each connected to
bus 136 by one or more interfaces 154.

[0030] The drives and associated computer-readable media provide nonvolatile storage of computer readable
instructions, data structures, program modules, and other data for computer 130. Although the exemplary environment
described herein employs a hard disk, a removable magnetic disk 148 and a removable optical disk 152, it should be
appreciated by those skilled in the art that other types of computer readable media which can store data that is
accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, random access memories
(RAMs), read only memories (ROM), and the like, may also be used in the exemplary operating environment.
[0031] A number of program modules may be stored on the hard disk, magnetic disk 148, optical disk 152, ROM 
138, or RAM 140, including, e.g., an operating system 158, one or more application programs 160, other program 
modules 162, and program data 164.

[0032] The improved methods and arrangements described herein may be implemented within operating system 
158, one or more application programs 160, other program modules 162, and/or program data 164. 
[0033] A user may provide commands and information into computer 130 through input devices such as keyboard 
166 and pointing device 168 (such as a "mouse"). Other input devices (not shown) may include a microphone, joystick, 
game pad, satellite dish, serial port, scanner, camera, etc. These and other input devices are connected to the processing
unit 132 through a user input interface 170 that is coupled to bus 136, but may be connected by other interface
and bus structures, such as a parallel port, game port, or a universal serial bus (USB). 

[0034] A monitor 172 or other type of display device is also connected to bus 136 via an interface, such as a video
adapter 174. In addition to monitor 172, personal computers typically include other peripheral output devices (not 
shown), such as speakers and printers, which may be connected through output peripheral interface 175. 
[0035] Computer 130 may operate in a networked environment using logical connections to one or more remote 
computers, such as a remote computer 182. Remote computer 182 may include many or all of the elements and 
features described herein relative to computer 130. 

[0036] Logical connections shown in Fig. 1 are a local area network (LAN) 177 and a general wide area network 
(WAN) 179. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets,
and the Internet. 

[0037] When used in a LAN networking environment, computer 130 is connected to LAN 177 via network interface 
or adapter 186. When used in a WAN networking environment, the computer typically includes a modem 178 or other
means for establishing communications over WAN 179. Modem 178, which may be internal or external, may be connected
to system bus 136 via the user input interface 170 or other appropriate mechanism.

[0038] Depicted in Fig. 1 is a specific implementation of a WAN via the Internet. Here, computer 130 employs modem
178 to establish communications with at least one remote computer 182 via the Internet 180. 

[0039] In a networked environment, program modules depicted relative to computer 130, or portions thereof, may 
be stored in a remote memory storage device. Thus, e.g., as depicted in Fig. 1, remote application programs 189 may
reside on a memory device of remote computer 182. It will be appreciated that the network connections shown and
described are exemplary and other means of establishing a communications link between the computers may be used. 

Improved Rate/Distortion Optimization Techniques: 



[0040] Although the following sections describe certain exemplary methods and apparatuses that are configured to
initially compress/encode and decompress/decode image data and/or video data, those skilled in the art of data
compression will recognize that the techniques presented can be adapted and employed to compress/encode and
decompress/decode other types of data. For example, certain methods and apparatuses may be adapted for use in
compressing/encoding audio data, speech data and the like.

[0041] Furthermore, although the exemplary methods and apparatuses can be configured in logic within a computer, 
those skilled in the art will recognize that such methods and apparatuses may be implemented in other types of devices, 
appliances, etc. The term "logic" as used herein is meant to include hardware, firmware, software, or any combination 
thereof, and any other supporting hardware or other mechanisms as may be required to fulfill the desired functions,
either fully or partially.

[0042] With this in mind, several exemplary schemes are presented that can be implemented in some form of logic
to support the processing of data. 

[0043] In accordance with certain aspects of the present invention, several novel techniques are presented for im- 
proving the performance of video and/or image encoding/decoding systems. In some exemplary implementations, 
these techniques are employed for use with image/video coding standards such as JPEG and JVT (Joint Video Team) 
Standard (e.g., H.264/AVC). By way of example, for the case of JVT, syntax changes are presented that can operatively 
enable an adaptive selection of different prediction types that can be used for predicting certain parameters of the 
video, such as, e.g., motion information.

[0044] Rate Distortion Optimization (RDO) techniques are quite popular in video and image encoding/decoding sys- 
tems since they can considerably improve encoding efficiency compared to more conventional encoding methods. 
There is a continuing desire to provide even more encoding efficiency. This description describes methods and appa- 
ratuses that can significantly improve the performance of RDO or other like techniques to achieve improved coding 
efficiency versus existing systems. 
[0045] In accordance with certain further aspects of the present invention, one may also further combine and improve
RDO or other like techniques with image pre-analysis concepts, such as, for example, edge and/or texture detection,
by using adaptive and/or variable size Deadzone quantizers depending on the characteristics of the image or
macroblock.

[0046] This description also introduces/defines some additional exemplary syntax changes that can be implemented
to enable a combination of different prediction schemes at a frame level, for example, thus further improving the
performance of video coding schemes. By way of example, one technique arises from the fact that for some frames or
portions of a sequence, motion may be more correlated in a temporal domain than in a spatial domain, or vice versa.
This could be exploited by performing a pre-analysis of the frame, but also through encoding the same frame using
two or possibly more different methods and selecting a preferred method in an RDO and/or RDO-like sense. The
preferred method may then be signaled in the resulting data, for example, in the frame header, to allow a decoder to
properly decode the frame. Here, for example, one such exemplary method may include the possible variation of a
Direct Mode within B frames by either using a spatial prediction or a temporal prediction, or a Skip Mode motion vector
selection within P frames by using either spatially predicted motion vector parameters or temporally predicted motion
vector parameters, or even zero.

Applying adaptive and/or variable Deadzone Quantization to data according to one or more characteristics 
(parameters): 

[0047] In image data compression systems, part of the data to be compressed, such as, for example, blocks or
macroblocks, may actually include more significant information that, when compared to other information (data), should
be coded differently (e.g., at a higher priority, in a higher quality, etc.). One way to accomplish this is to use different
quantizer values. For example, in certain implementations a smaller quantizer value may be used for "more important 
information" and a larger quantizer value may be used for the "less important information". However, doing so would 
also typically require the transmission of information identifying each quantizer value used for each block, macroblock, 
groups of macroblocks, etc., so that subsequent decompressing/decoding is successful. Unfortunately, such additional 
information tends to increase the compressed overhead and the complexity of the encoder. Thus, instead of increasing 
efficiency there may actually be a reduction in efficiency. 

[0048] Attention is drawn to Figs 2(a-d), which are each illustrative graphs depicting certain exemplary Quantizers
employable within certain exemplary image/video coding schemes. In each of Figs 2(a-d), the vertical (y-axis) represents
quantized values and the horizontal (x-axis) represents original values. The illustrated exemplary Deadzone Quantizer
202 in Fig. 2(a), for example, is associated with a Deadzone Quantization A that can considerably improve coding
efficiency versus a uniform quantizer. Conventional Deadzone Quantizers are often kept constant or uniform throughout
the quantization process, thus possibly not completely exploiting all existing redundancies within the data.

[0049] By considering or otherwise taking into account the importance of certain information within data and by
adapting/modifying the Deadzone Quantizer on a block/macroblock basis, for example, an improvement in coding
efficiency may be achieved. This can be done, for example, by adapting the coverage of each quantization bin (e.g.,
along the x-axis), but without changing the reconstructed value. For example, compare Deadzone Quantizer 202 to
Deadzone Quantizers 204, 206 and 208 of Figs 2(b, c, and d), respectively. Here, the reconstructed values may remain
constant throughout the coded data, unless of course a change in quantization parameters is signaled. For example,
by increasing the zero bin (Fig. 2(c)) more data will be assigned to it, thus, depending on the compression scheme,
achieving higher compression. There is no need to signal the change in the quantizer to the decoder, since
the reconstructed values remain the same. Even though one could argue that such would impair the performance of
the quantization, this is not necessarily always true if such processes are done selectively when certain condition(s)
are satisfied, for example, if the gain in compression achieved using the new quantizer is considerably higher than the
incurred distortion.

[0050] In Fig. 2(b), for example, the Deadzone quantizer only affects the zero bin versus the first positive/negative
bins, whereas all other bins remain the same and there is no change to the reconstructed value used. An adaptive
scheme with adaptive estimate of the Deadzone is also possible (e.g., using Rate Distortion Optimization and adaptive
estimate).
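
As a minimal sketch of the idea in paragraphs [0049] and [0050], the following shows a quantizer whose zero bin can be widened or narrowed while the reconstructed values stay fixed at multiples of the step size. The names `deadzone_quantize`, `reconstruct`, `step` and `zero_bin_half_width`, and the use of simple rounding outside the zero bin, are assumptions made for illustration; the actual bin boundaries of Deadzone Quantizers 202 to 208 are defined by Figs 2(a-d), not by this code.

```python
def deadzone_quantize(x, step, zero_bin_half_width):
    """Map a value to an integer level.

    Values with |x| < zero_bin_half_width fall into the zero bin. Outside
    the zero bin, the value is rounded to the nearest nonzero multiple of
    `step`, so only the boundary between level 0 and level +/-1 moves.
    """
    magnitude = abs(x)
    if magnitude < zero_bin_half_width:
        return 0
    level = max(1, int(magnitude / step + 0.5))
    return level if x > 0 else -level


def reconstruct(level, step):
    """Decoder side: independent of the zero-bin width chosen by the encoder."""
    return level * step
```

For example, with `step = 10` a value of 6 maps to level 1 when the half width is 5 but to level 0 when the half width is 8, while `reconstruct` is unchanged in both cases, which is why no extra signaling to the decoder would be needed in this sketch.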

[0051] In accordance with certain exemplary implementations of the present invention, quantization selecting logic is
therefore provided in the compressing/encoding system to select between different Deadzone Quantizers (or
quantization values) based on at least one characteristic or parameter. For example, the logic may select between different
Deadzone Quantizers 202, 204, 206 and 208 according to certain image characteristics. 

[0052] More particularly, the logic may be configured to characterize texture and/or edges within the image data as 
representing "more important" information and therefore code such data in a manner to provide a higher level of quality.
The logic can use conventional texture analysis/detection and edge detection algorithms to support such decision 
processes. 
[0053] With this concept in mind, reference is now made to Fig. 3, which is a flow-diagram illustrating a method 300
for selectively applying different quantization processes to data, in accordance with certain implementations of the
present invention. In act 302, an initial data set is divided into a plurality of portions of data. For example, image or
video data may be divided into a plurality of blocks, macroblocks, slits, slices, sections, etc. In act 304, at least one of
the plurality of portions of data from act 302 is classified in some manner, for example, based on at least one
characteristic or parameter. For example, the classification may be based on edge characteristics, texture characteristics,
smoothness characteristics, luminance characteristics, chrominance characteristics, color characteristics, noise
characteristics, object characteristics, motion characteristics, user preference characteristics, user interface focus
characteristics, layering characteristics, timing characteristics, volume characteristics, frequency characteristics, pitch
characteristics, tone characteristics, quality characteristics, bit rate characteristics, data type characteristics, resolution
characteristics, encryption characteristics, or the like.

[0054] In act 306, the classified portion from act 304 is processed using at least two of a plurality of quantization 
processes to produce corresponding quantized data. In act 308, a decision is made wherein one of the quantized data
from act 306 is selected, for example, based on satisfying at least one threshold value or measurement that is associated
with the classification parameter used in act 304. In act 310, the quantized data from act 308 is encoded in some
manner that allows it to be subsequently decoded. 
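
The acts of method 300 can be sketched end to end as follows. This is only an illustrative outline under stated assumptions: the data is a flat list of samples, the classifier is a crude range test, the per-class thresholds in `MAX_MSE` are invented values, mean squared error stands in for the "threshold value or measurement", and returning the chosen levels stands in for the encoding of act 310. None of these specifics come from the description.

```python
# Assumed per-class distortion limits (hypothetical values).
MAX_MSE = {"edge": 5.0, "smooth": 50.0}


def split_blocks(samples, size):
    """Act 302: divide the initial data into portions."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def classify(block, edge_range=50):
    """Act 304: classify a portion; a large value range is treated as an edge."""
    return "edge" if max(block) - min(block) >= edge_range else "smooth"


def quantize(block, step, dead):
    """Quantize with a zero bin of half width `dead`; other bins use rounding."""
    levels = []
    for x in block:
        m = abs(x)
        lvl = 0 if m < dead else max(1, int(m / step + 0.5))
        levels.append(lvl if x >= 0 else -lvl)
    return levels


def mse(block, levels, step):
    return sum((x - l * step) ** 2 for x, l in zip(block, levels)) / len(block)


def method_300(samples, size=4, step=10, dead_zones=(8, 5)):
    out = []
    for block in split_blocks(samples, size):
        label = classify(block)
        # Act 306: apply each quantization process, widest zero bin first.
        candidates = [quantize(block, step, d) for d in dead_zones]
        # Act 308: keep the first candidate that meets the class threshold,
        # falling back to the narrowest zero bin if none does.
        chosen = candidates[-1]
        for levels in candidates:
            if mse(block, levels, step) <= MAX_MSE[label]:
                chosen = levels
                break
        out.append((label, chosen))  # Act 310 stand-in: hand levels to a coder.
    return out
```

With the input `[6, 7, 6, 7, 0, 60, 6, 0]`, the smooth first block accepts the wide zero bin, whereas the edge block rejects it and falls back to the narrower one.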

[0055] Using method 300, logic can be provided that essentially analyzes portions of image or video data and deems 
certain portions to be more important than others. Different Deadzone Quantizers are then applied to the more important 
data portions and the resulting quantized data analyzed to determine which Deadzone Quantizer(s) satisfy a desired 
threshold requirement for this more important data. For example, a quality or noise threshold requirement may be 
enforced. 

[0056] By way of further example, in video sequences edges often play a significant role in motion compensation 
techniques. This is illustrated, for example, in the block diagram depicted in Fig. 7 that is described in greater detail 
below. Basically, it is possible to perform image analysis on an image (frame) or portion thereof, and according to the 
analysis decide which one of a plurality of Deadzone Quantizers is best to use, for example, according to Rate
Distortion Optimization (RDO) criterion. 

[0057] With reference to Fig. 4, logic 400 illustrates an on-the-fly decision process in accordance with certain exem- 
plary implementations of the present invention. Here, an input frame/image (or portion thereof) is subjected to image 
analysis in block 402. In this example, the image analysis includes an edge detection and/or texture analysis capability 
and the output is provided to a Deadzone decision block 404. Deadzone decision block 404 then causes an encoder 
406 to use a specified or otherwise selected Deadzone Quantizer or quantization value(s) when encoding the input 
frame/image (or portion thereof). 

[0058] Rather than making an on-the-fly decision regarding Deadzone Quantization, logic 500 in Fig. 5 is 
configurable to support method 300, wherein a plurality of Deadzone Quantizers are used and an RDO decision is made 
based on certain threshold criteria. Here, an input macroblock (MB)/image 502 (or other like portion) is (selectively) 
provided to different Deadzone encoding blocks 504, 506, 508, and/or 510, the outputs of these various 
Deadzone encoding blocks are analyzed in RDO decision block 512, and the selected encoded data is output. As 
illustrated in this example, some or all of the Deadzone encoding blocks/processes may occur in parallel. In other 
implementations, such processes may be timed to occur serially. 

[0059] Fig. 6(a) and Fig. 6 (b) illustrate logic wherein a selective recoding decision process is used. In Fig. 6(a), for 
example, input MB/image 502 is provided to Deadzone encoding blocks 602 and 604. A recoding decision block 606 
considers the output from Deadzone encoding block 602 and affects the selection 608 between the outputs of Deadzone 
encoding blocks 602 and 604. Recoding decision block 606 may also selectively initiate Deadzone encoding block 
604. In Fig. 6(b), logic 610 is similar to logic 600 but, rather than having selection 608 associated with recoding decision 
block 606, it includes an RDO decision block 612 that is configured to analyze the outputs from Deadzone encoding 
blocks 602 and 604 and decide which to output. 

[0060] Recoding decision block 606 in Figs 6(a, b) may be configured to make decisions based on various criteria. 
For example, in certain implementations, quality limits, rate limits and/or other RDO concepts/thresholds may be con- 
sidered. 
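
The selective recoding of Fig. 6(a) can be sketched as follows. This is only an illustrative sketch: the encoders are modeled as hypothetical callables that return a dictionary with a `bits` field, and the rate-limit test built by `make_rate_limit` is just one assumed example of the criteria mentioned above.

```python
def encode_with_recoding(block, first_encoder, second_encoder, is_acceptable):
    """Encode with the first Deadzone encoder and recode only when needed.

    Returns (encoded_result, recoded), where `recoded` tells whether the
    second Deadzone encoder had to be initiated.
    """
    first = first_encoder(block)
    if is_acceptable(first):
        return first, False
    return second_encoder(block), True


def make_rate_limit(max_bits):
    """Build a hypothetical acceptance test: the result must fit a bit budget."""
    def is_acceptable(result):
        return result["bits"] <= max_bits
    return is_acceptable
```

In this arrangement the second encoding pass costs nothing when the first output already satisfies the decision criteria, which is the main attraction of recoding over always running every Deadzone encoder.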

[0061] It is also possible to use the additional Deadzone quantizers only if some previously defined conditions are 
satisfied, such as, for example, the rate/quality being above a particular threshold. In certain exemplary 
implementations, a Deadzone Quantizer was successfully selected which had about a 30% larger Deadzone than the original. 
Other Deadzone quantizers, possibly adaptive according to image characteristics such as AC frequencies or edge types, 
may also be used. 
[0062] In Fig. 7, exemplary encoding logic 700 illustratively demonstrates how an initial image 702 (or portion thereof) 
is processed according to an image analysis process 704 to produce, in this example, edge detection data 706. Edge 
detection data 706 is then provided to a block classification process 708 to produce block classified data 710. Block 
classified data 710 is then provided along with initial image 702 to an image encoding process 712, which then produces 
encoded image 714. This is one example of a simple encoding process for an image. Here, the image is analyzed 
(e.g., using an edge detection algorithm) and then blocks are classified according to such information. Essentially, an 
N-ary matrix (DZ_p) is defined (N depends on the number of Deadzones defined) which later on assigns the proper 
Deadzone (DZ) for the Quantizer (QP) assigned at the Macroblock at position (i, j). 
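
The block classification step can be sketched as follows. This is a hedged illustration only: the use of a mean edge magnitude per block, the threshold list, the function names `classify_blocks` and `deadzone_for`, and the per-class scale factors are assumptions introduced here, not details taken from Fig. 7.

```python
def classify_blocks(edge_map, block_size, thresholds):
    """Build an N-ary class matrix, one entry per block.

    `edge_map` is a 2-D list of edge magnitudes (one per pixel). A block's
    class is the number of thresholds its mean edge magnitude reaches, so
    N = len(thresholds) + 1.
    """
    rows = len(edge_map) // block_size
    cols = len(edge_map[0]) // block_size
    dz_p = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for y in range(i * block_size, (i + 1) * block_size):
                for x in range(j * block_size, (j + 1) * block_size):
                    total += edge_map[y][x]
            mean_edge = total / (block_size * block_size)
            dz_p[i][j] = sum(mean_edge >= t for t in thresholds)
    return dz_p


def deadzone_for(dz_p, i, j, base_deadzone, scales):
    """Look up the Deadzone for the block at (i, j) from its class.

    `scales` holds one hypothetical multiplier per class.
    """
    return base_deadzone * scales[dz_p[i][j]]
```

A table such as `scales = [1.3, 1.15, 1.0]` would give smooth blocks a Deadzone about 30% larger than edge blocks; these particular values are illustrative.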

Consideration of non residual modes (e.g., for B frame Direct Mode): 

[0063] Even though the above description might appear to be mainly focused on still images, the same concepts 
could easily be applied to video as well. For example, the image pre-analysis and the Deadzone quantizers can be 
used in a rather straightforward manner in video compression as well. 

[0064] One case that appears quite interesting and can considerably benefit from the above concepts is the usage 
of Direct Mode within B frames. Direct Mode is basically a special mode which requires no transmission of motion 
parameters since such can be directly predicted through either spatial or temporal prediction. Additional information 
on Direct Mode can be found in co-pending U.S. Provisional Patent Application Serial No. 60/385,965. 
[0065] If there is no residue to be transmitted, efficiency can be improved further by the usage of a special mode, 
referred to herein as a Non Residual Direct Mode, in accordance with certain further aspects of the present invention. 
As described below, the Non Residual Direct Mode can be configured to exploit Run Length encoding (RLC) strategies. 
Here, for example, if the distortion incurred is small enough and the reduction in bitrate due to the higher efficiency of 
the RLC is significant enough, then a Deadzone quantizer may provide a desirable solution. The basic idea is to 
selectively cause or otherwise force the Direct Mode, under certain conditions, to be coded without Residue even though 
such exists. Schemes based on the same concepts as in Fig. 5, for example, as shown in Fig. 9, may be implemented 
wherein the Non Residual Direct Mode is also examined within the RDO process and compared to all other available modes. 
In certain instances, the performance of such a scheme may not be as good as expected since the RDO used is 
inadequate for such cases. Other processes that are dependent on the Quantization values are also affected, such as, 
for example, an in-loop filter (not shown) that is used to remove blocking artifacts. More specifically, even though 
performance appears to be good at lower bit rates, performance may suffer significantly at higher bit rates; it could 
even be outperformed by the usage of a larger quantizer without the consideration of the Non Residual Direct Mode. 
[0066] Similar to what was done in the image examples, in accordance with certain aspects of the present invention 
the logic can be configured to consider such modes only if some previously defined conditions are satisfied and, in 
particular, if the residue associated with the Direct Mode is not significant. The logic may, for example, be configured 
to use as a condition an estimate of how significant this residue is by examining the Coded Block Pattern (CBP) of the 
Direct Mode. If the CBP, without considering chrominance information, is below a particular threshold, then this may 
be considered to imply that the residue is not as significant and, if skipped, it might not incur too much distortion. 
Further, other image characteristics such as the non-existence of edges and texture may also be used within such video 
encoding processes/logic. 
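
The CBP condition can be sketched as follows. The sketch assumes a JVT-style 6-bit CBP in which the low four bits flag the four 8x8 luminance blocks and the upper bits carry chrominance information, and it interprets "below a particular threshold" as a limit on the number of coded luminance blocks. Both the interpretation and the function names are assumptions.

```python
def luma_cbp(cbp):
    """Keep only the luminance part of the CBP (assumed to be the low 4 bits)."""
    return cbp & 0x0F


def consider_non_residual_direct(cbp, threshold):
    """Return True when the Direct Mode residue looks insignificant.

    The residue is treated as insignificant when fewer than `threshold`
    luminance 8x8 blocks are coded; chrominance bits are ignored.
    """
    coded_luma_blocks = bin(luma_cbp(cbp)).count("1")
    return coded_luma_blocks < threshold
```

Only macroblocks passing this test would then have the Non Residual Direct Mode added to their RDO candidates, which limits both the extra encoding work and the risk of skipping significant residue.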

[0067] In certain implementations, the logic may extend this even further by examining whether only the 
chrominance residue could be removed while keeping all luminance residue intact. It is also possible to extend such concepts 
to all possible modes for a macroblock, that is, to examine whether sending no residue at all, or no chrominance 
residue, for a given mode would give better performance. Obviously, though, such could increase the encoding 
complexity even further. 

[0068] With regard to Fig. 9, logic 900 illustrates how an input MB/image 902 is provided to different Direct Mode (B 
frames) or Copy Mode (Inter frames) encoding process blocks 902, 904, 906, and 908, and the outputs from those 
process blocks are provided to a frame based RDO decision block 910 that selects an appropriate output. 
[0069] With regard to Fig. 10, logic 1000 further illustrates how a scheme decision block 1002 and selection 1004 
can also be included to provide additional selectability depending on user inputs, the application, system requirements, 
etc. Here, scheme decision block 1002 selectively provides input MB/image to one or more of the Direct Mode (B 
frames) or Copy Mode (Inter frames) encoding process blocks 902, 904, 906, and 908. Selection 1004 may be controlled 
by scheme decision block 1002 or other processes/logic/inputs. 

Exemplary use of Lagrangian multiplier for B frames: 

[0070] RDO techniques are often based on the concepts of Lagrangian Multipliers. For example, a specific mode 
may be selected that jointly minimizes distortion and bit rate. 
[0071] Such a function can be expressed as the minimization of: 

$$J(\mathrm{Mode} \mid QP, \lambda) = \mathrm{SSD}(\mathrm{Mode} \mid QP) + \lambda \cdot R(\mathrm{Mode} \mid QP)$$

where QP is the macroblock quantizer, λ is the Lagrangian multiplier for mode decision, and Mode indicates the 
macroblock mode that is to be examined and possibly selected within the RDO process. 
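
The mode decision implied by this cost, J = SSD + λ·R, can be sketched as follows. The candidate table and the names `rd_cost` and `select_mode` are hypothetical; in a real encoder the distortion (SSD) and rate (bits) values would come from actually encoding the macroblock in each mode.

```python
def rd_cost(ssd, rate_bits, lam):
    """Lagrangian cost J = SSD + lambda * R for one candidate mode."""
    return ssd + lam * rate_bits


def select_mode(candidates, lam):
    """Pick the mode with the lowest Lagrangian cost.

    `candidates` maps a mode name to a (ssd, rate_bits) pair.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))
```

A larger λ penalizes rate more heavily, so low-overhead modes such as Direct Mode win more often as λ grows.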

[0072] By way of example, in certain implementations, the Lagrangian multiplier λ can be selected for Inter or Intra 
frames as: 

$$\lambda_{I,P} = 0.85 \times 2^{QP/3}$$

or 

$$\lambda_{I,P} = 5 \times \frac{QP + 5}{34 - QP} \times \exp\left(\frac{QP}{10}\right)$$

whereas in B frames, in most codecs such as JVT, this is selected as $\lambda_B = 4 \times \lambda_{I,P}$. 
[0073] The additional weighting of λ was done in order to give preference to lower overhead modes, since particularly 
for B frames, modes can have large overhead due to the multiple motion information transmitted, while the lower 
overhead modes, such as the Direct Mode, could still provide a very good, in terms of RDO, performance. 
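
The multiplier selection can be sketched numerically as follows. The sketch assumes that the two Inter/Intra expressions above read as 0.85 × 2^(QP/3) and 5 × (QP + 5)/(34 − QP) × exp(QP/10), and that the B frame multiplier is a weight applied to the first of these. The function names and the `weight` parameter are hypothetical.

```python
import math


def lambda_ip_exponential(qp):
    """First form: 0.85 * 2^(QP/3)."""
    return 0.85 * 2 ** (qp / 3)


def lambda_ip_rational(qp):
    """Second form: 5 * (QP + 5) / (34 - QP) * exp(QP / 10); needs QP < 34."""
    if qp >= 34:
        raise ValueError("this form is only defined for QP below 34")
    return 5 * (qp + 5) / (34 - qp) * math.exp(qp / 10)


def lambda_b(qp, weight=4.0):
    """B frame multiplier: a weight (fixed at 4 by default) times lambda_ip."""
    return weight * lambda_ip_exponential(qp)
```

Keeping the weight as a parameter makes it straightforward to substitute a QP-dependent weighting for the constant 4.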
[0074] Based on certain experiments in accordance with the present invention, however, it has been found that the 
weighting should not be constant as described above, but instead should again be dependent on the quantizer QP 
value. 

[0075] In particular, if one defines $\lambda_B = f(QP) \times \lambda_{I,P}$, from these experiments it has been found that two f(QP) functions 
that could be used with much better compression efficiency than the fixed f(QP) = 4 case are: 



r mo \ 
[0076] This observation also comes from the fact that by having a very high λ, one may also be affecting the accuracy 
of other important information, such as, for example, that of the motion prediction, which in consequence may have a 
negative impact on the encoding of the surrounding macroblocks. 

Using adaptive type selection of prediction dependent MB modes: 



[0077] As also described in co-pending U.S. Provisional Patent Application Serial No. 60/385,965, co-pending U.S. 
Provisional Patent Application Serial No. 60/376,005, and co-pending U.S. Patent Application Serial No. 10/186,284, 
sequences and frames may have different types of dominant motion correlation. 

[0078] In particular, for small objects with constant speed in a stationary background, the usage of motion vectors 
(MVs) from temporally adjacent frames (temporal domain) enables one to perform better prediction and yields higher 
performance. Larger objects with smooth motion may instead have higher correlation in the spatial domain (adjacent 
macroblocks), whereas in other cases information from both the spatial and temporal domains may be important for the 
prediction. These types of correlations are partly exploited, for example, within the Direct Mode in B and P 
frames and Skip on Motion Vector Predictor in P frames. For more information on Skip on Motion Vector Predictor in 
P frames see, e.g., Jani Lainema and Marta Karczewicz, "Skip mode motion compensation", document JVT-C027, 
JVT Meeting, Fairfax, May 2002. Thus, if the logic somehow signals which type of prediction is predominant at different 
frames, considerably higher performance can be achieved. 

[0079] Hence, in accordance with certain aspects of the present invention, the encoding logic is configured to signal 
or otherwise identify in some manner at the frame, slice, or some other like level, which prediction scheme for prediction 
dependent modes is to be used. One exemplary syntax for accomplishing this, as an example within JVT, is presented 
in chart 800 in Fig. 8. Such syntax could, of course, be modified/different in other encoding designs. 
[0080] With regard to the exemplary syntax in chart 800, three (3) possible prediction cases are allowed for the P 
frame skip mode. In this example, there is a Motion-Copy prediction mode, a temporal prediction skip mode and the 
zero skip mode. Each one of these cases is assigned a value in {0, 1, 2}, which can be coded using either a 
fixed size codeword, in this example u(2) = 2 bits, or even entropy coded (e.g., e(v) using UVLC or CABAC). Those 
skilled in the art will recognize other ways of conveying such information that may also be employed. Chart 800 
illustrates Picture Layer RBSP Syntax within JVT with the addition of Adaptive Spatial/Spatio-temporal 
consideration for B Frames (direct_mv_spatial) and Adaptive Copy/Motion-Copy Skip mode in Inter frames (copy_mv_spatial). 
If only these two modes are used, then the defined descriptors may take only 1 bit, thus u(n=1), but if more cases are 
to be used (spatial prediction with zero bias or consideration of stationary temporal prediction) then more bits could 
be assigned (n>1), or entropy coding defined by e(v) could even be used for this parameter. 
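
The fixed size u(n) coding of such a descriptor can be sketched as follows. This is an assumption-laden illustration: bit strings stand in for a real bitstream writer, the helper names are hypothetical, and the particular assignment of the values 0, 1 and 2 to the three skip cases is chosen only for the example.

```python
# Hypothetical value assignment for the three P frame skip cases.
MOTION_COPY, TEMPORAL_SKIP, ZERO_SKIP = 0, 1, 2


def bits_for_cases(num_cases):
    """Smallest n such that u(n) can distinguish `num_cases` values."""
    return max(1, (num_cases - 1).bit_length())


def write_u(value, n):
    """Write `value` as an n-bit unsigned field, most significant bit first."""
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in u(%d)" % n)
    return format(value, "0{}b".format(n))


def read_u(bits, n):
    """Read an n-bit unsigned field and return (value, remaining_bits)."""
    return int(bits[:n], 2), bits[n:]
```

Two cases fit in u(1), while three cases need u(2), matching the bit counts given above.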

[0081] For B frames, in this example, Spatial motion vector prediction may be used for all Direct Mode motion pa- 
rameters as one mode, and the temporally predicted parameters as a second mode. Other direct modes as described, 
for example, in co-pending U.S. Provisional Patent Application Serial No. 60/385,965 may also be considered/included. 
The encoding logic is configured to signal which prediction mode is to be used at the Frame or slice level. The selection 
can be performed, for example, using an RDO based scheme (e.g., Fig. 9). In certain implementations, the encoding 
logic may also make use of a specific mode explicitly due to specific requirements placed on the encoder and/or decoder. 
In one particular example, considering that the Spatial Prediction is usually computationally simpler (e.g., requiring no 
division, no storage of the motion vectors, and being independent of timing information), it may be the preferred choice for 
some applications (e.g., Fig. 10). 

[0082] In other implementations, where such problems are not an issue, the combination may yield further improved 
encoding performance. One example of an encoded sequence is shown in the illustrative diagram of Fig. 11. Here, P 
and B frames are shown along with a scene change. As illustrated by the arrows, differently signaled P and B frames 
are shown for corresponding Skip or Direct mode macroblocks. Note also that the signaling can be an indication of 
how the encoding logic should perform the motion vector prediction for the motion vector coding, or for the prediction 
of other modes (e.g., Direct P described in co-pending U.S. Provisional Patent Application Serial No. 60/376,005 and 
co-pending U.S. Patent Application Serial No. 10/186,284). 

[0083] As shown in Fig. 11, different frames signal different types of prediction for their corresponding Direct (B) and 
Skip (P) modes. P_Z, P_T, and P_M define, for example, zero, temporal and spatial (Motion-Copy) prediction, and B_T and 
B_SP define temporal and spatial prediction for Direct mode. 






Time Stamp independent Direct Mode, with the consideration of stationary temporal/spatial blocks: 



[0084] Different types of prediction, especially for the Direct Mode in B frames, may be more appropriate for different 
types of motion and sequences. Using temporal or spatial prediction only may in some cases provide acceptable 
performance, but in others performance might be considerably worse. A solution as was described in the preceding 
section, or for the cases presented in co-pending U.S. Provisional Patent Application Serial No. 60/385,965, may provide 
even better performance. 

[0085] By way of example, one additional case is presented which appears to be quite efficient and that combines 
the performance of both temporal and spatial predictors, while tending to keep the spatial predictor simple by not 
requiring division, and/or which is timing independent. 

[0086] In certain implementations, spatial prediction may be more useful (e.g., due to its properties) than temporal 
prediction. Thus, for example, spatial prediction is used as the main prediction of Direct Modes. One possible exception 
is made when the motion information and the reference frame from the temporal predictors are zero. In such a case, 
the motion information and reference frame for the corresponding block of the direct mode are also considered to be 
zero. Furthermore, the spatial prediction is refined by also considering a spatial zero biased-ness and/or stationary 
subpartitions. Accordingly, if any or some of the macroblocks or blocks adjacent to the currently predicted block have 
zero motion (or motion very close to zero, e.g., the integer motion vector is zero) and zero reference frame, then the 
entire macroblock, or part of it, is also considered to have zero motion. Both of these concepts help in protecting 
stationary backgrounds, which, in particular at the edges of moving objects, might be quite distorted if such conditions 
are not introduced. 

[0087] A flow diagram 1200, simplified for the case of a 16x16 Macroblock, is shown in Fig. 12. It is noted that such 
concepts using spatial prediction for the direct mode may also be extended to even smaller blocks (e.g., 8x8 or 4x4) 
(or larger blocks or other shaped portions). In act 1202, spatial predictors MV_a, MV_b, and MV_c and temporal predictor 
MV_t are provided to act 1204, wherein MV_Direct is set to Median(MV_a, MV_b, MV_c). In act 1206, a decision is made 
based on MV_t that leads to either act 1208, wherein MV_Direct is set to zero, or act 1210 for an additional decision. As 
a result of act 1210, MV_Direct is either set to zero in act 1208 or not changed, and MV_Direct is output. 
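
The flow of Fig. 12 can be sketched as follows. Several details are assumptions made for illustration: motion vectors are (x, y) integer pairs, the median is taken per component, reference frame indices are ignored for brevity, and the additional decision of act 1210 is taken to be the stationary-neighbor test described in paragraph [0086].

```python
ZERO_MV = (0, 0)


def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]


def direct_mv(mv_a, mv_b, mv_c, mv_t):
    """Derive the Direct Mode motion vector for a 16x16 macroblock.

    mv_a, mv_b, mv_c are the spatial predictors and mv_t the temporal one.
    """
    # Act 1204: component-wise median of the spatial predictors.
    mv_direct = (
        median3(mv_a[0], mv_b[0], mv_c[0]),
        median3(mv_a[1], mv_b[1], mv_c[1]),
    )
    # Act 1206: a stationary temporal predictor forces zero motion (act 1208).
    if mv_t == ZERO_MV:
        return ZERO_MV
    # Act 1210 (assumed test): any stationary spatial neighbor forces zero motion.
    if any(mv == ZERO_MV for mv in (mv_a, mv_b, mv_c)):
        return ZERO_MV
    return mv_direct
```

No division and no time stamps appear anywhere in the derivation, which is the property that keeps this predictor simple and timing independent.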

[0088] In this description, several concepts related to Rate Distortion Optimization for the encoding of images, video 
sequences, or other types of data have been presented. Additional syntax has been demonstrated that may be adopted 
within video sequences that enables considerably higher compression efficiency by allowing several alternative 
prediction cases, especially for cases such as the Skip and Direct Mode within P and B frames respectively, which may 
be signaled at the beginning of the image. A high efficiency time stamp independent Direct Mode for B frames has 
been presented which considers spatial motion vector prediction as well with stationary temporal predictors. All or part 
of the above methods and apparatuses can be implemented to significantly improve the performance of various image/ 
video/data coding systems. 

[0089] Although some preferred implementations of the various methods and apparatuses of the present invention 
have been illustrated in the accompanying Drawings and described in the foregoing Detailed Description, it will be 
understood that the invention is not limited to the exemplary embodiments disclosed, but is capable of numerous 
rearrangements, modifications and substitutions without departing from the spirit of the invention. 

Claims 

1. A method comprising: 

processing at least a portion of data using a plurality of different quantization functions to produce a plurality 
of corresponding quantized portions of data; and 

selectively outputting one of said quantized portions of data based on at least one threshold value. 



2. The method as recited in Claim 1, further comprising: 

dividing initial data into a plurality of portions including said at least one portion of data; and 
classifying said at least one portion of data based on at least one classification characteristic. 

3. The method as recited in Claim 2, wherein said at least one threshold value is associated with said at least one 
classification characteristic. 

4. The method as recited in Claim 2, wherein said initial data includes data selected from a group comprising image 
data, video data, audio data, and speech data. 

5. The method as recited in Claim 2, wherein said initial data includes image data or video data and said at least one 
portion is selected from a group comprising a block, a macroblock, a slit, a slice, and a section. 

6. The method as recited in Claim 2, wherein said at least one classification characteristic is selected from a group 
of characteristics comprising an edge characteristic, a texture characteristic, a smoothness characteristic, a 
luminance characteristic, a chrominance characteristic, a color characteristic, a noise characteristic, an object 
characteristic, a motion characteristic, a user preference characteristic, a user interface focus characteristic, a layering 
characteristic, a timing characteristic, a volume characteristic, a frequency characteristic, a pitch characteristic, a 
tone characteristic, a quality characteristic, a bit rate characteristic, a data type characteristic, a resolution 
characteristic, and an encryption characteristic. 

7. The method as recited in Claim 1, wherein said plurality of different quantization functions include at least two 
operatively different Deadzone Quantizers. 



8. The method as recited in Claim 7, wherein at least one of said Deadzone Quantizers includes an adaptive coverage 
Deadzone Quantizer. 

9. The method as recited in Claim 7, wherein at least one of said Deadzone Quantizers includes a variable coverage 
size Deadzone Quantizer. 

10. The method as recited in Claim 1, wherein selectively outputting said one of said quantized portions of data based 
on said at least one threshold value further includes: 

encoding said one of said quantized portions. 

11. The method as recited in Claim 1, wherein selectively outputting said one of said quantized portions of data based 
on said at least one threshold value further includes performing Rate Distortion Optimization (RDO) to select said 
one of said quantized portions of data. 
12. A method comprising: 



performing at least one characteristic analysis on at least one portion of image data; 

selectively setting at least one adaptive quantization parameter within an encoder based on said at least one 
characteristic analysis; and 

encoding said at least one portion of image data with said encoder. 

13. The method as recited in Claim 12, wherein said at least one characteristic analysis considers at least one image 
analysis characteristic selected from a group of characteristics comprising an edge characteristic, a texture char- 
acteristic, a smoothness characteristic, a luminance characteristic, a chrominance characteristic, a color charac- 
teristic, a noise characteristic, an object characteristic, a motion characteristic, a user preference characteristic, a 
user interface focus characteristic, a layering characteristic, a timing characteristic, a quality characteristic, a bit 
rate characteristic, a data type characteristic, and a resolution characteristic. 

14. The method as recited in Claim 12, wherein said at least one adaptive quantization parameter is associated with 
an adaptive coverage Deadzone Quantizer within said encoder. 

15. The method as recited in Claim 12, wherein selectively setting said at least one adaptive quantization parameter 
within said encoder based on said at least one characteristic analysis is performed on-the-fly for each of a 
plurality of portions of said image data, which includes a video frame. 



16. A method comprising: 



causing at least one portion of image data to be encoded using at least two different Deadzone Quantizers; and 
identifying preferred encoded data in an output of one of said at least two different Deadzone Quantizers based 
on a Rate Distortion Optimization (RDO) decision associated with at least one decision factor. 

17. A method comprising: 



causing at least one portion of image data to be encoded using a first Deadzone Quantizer; 

determining if an output of said first Deadzone Quantizer satisfies at least one decision factor and 
if so, then outputting said output of said first Deadzone Quantizer, 

if not, then causing said at least one portion of image data to be encoded using at least a second 
Deadzone Quantizer that is different from said first Deadzone Quantizer. 

18. The method as recited in Claim 17, further comprising: 



identifying an acceptable encoded version of said at least one portion of said image data based on a Rate 
Distortion Optimization (RDO) decision. 


19. A method comprising: 

performing image analysis on at least one portion of image data; 
performing block classification on said analyzed portion of image data; 
performing Deadzone Quantization of said block classified portion of image data; and 

performing encoding of said Deadzone Quantized portion of image data. 

20. The method as recited in Claim 19, wherein said image analysis includes at least one type of analysis selected 
from a group comprising edge detection analysis and texture analysis. 

21. The method as recited in Claim 19, wherein said block classification is operatively configured based on said image 
analysis and said Deadzone Quantization is operatively configured based on said block classification. 

22. A method comprising: 


causing at least one portion of video image data to be encoded using at least two different encoders wherein 
at least one of said two different encoders includes a Deadzone Quantizer operatively configured to support 
a Non Residual Mode of said video image data; and 
identifying preferred encoded frame data in an output of one of said at least two different encoders based on 
a Rate Distortion Optimization (RDO) decision associated with at least one decision factor. 

23. The method as recited in Claim 22, wherein said Deadzone Quantizer is adaptive and selectively causes said Non 
Residual Mode to be used. 

24. The method as recited in Claim 22, wherein at least one of said encoders is operatively configured to support a 
Direct Mode of said video image data. 

25. The method as recited in Claim 22, wherein at least one of said encoders is operatively configured to support a 
Copy Mode of said video image data. 

26. The method as recited in Claim 22, wherein said Non Residual Mode is operatively configured based on a Run 
Length encoding (RLC) strategy. 



27. The method as recited in Claim 22, wherein said Deadzone Quantizer is operatively adaptive. 

28. The method as recited in Claim 22, wherein said Deadzone Quantizer is operatively configured based on at least 
one characteristic associated with said at least one portion of video image data. 

29. The method as recited in Claim 22, wherein identifying said preferred encoded frame data further includes analyzing 
an amount of a residue associated with said Direct Mode frame data. 

30. The method as recited in Claim 29, wherein analyzing said amount of said residue further includes examining a 
Coded Block Pattern (CBP) of said Direct Mode frame data. 

31. The method as recited in Claim 22, wherein said encoder having said Deadzone Quantizer is selectively configured 
to remove at least one type of residue selected from a group of residue data comprising chrominance residue data, 
luminance residue data, and all residue data. 






32. A method comprising: 

selectively varying at least one Lagrangian multiplier that is operatively configuring encoding logic having a 
quantizing function based on at least one characteristic of at least one portion of image data; and 
encoding said at least one portion of said image data using said encoding logic. 

33. The method as recited in Claim 32, wherein said quantizing function includes a macroblock quantizer based on: 

$$J(\mathrm{Mode} \mid QP, \lambda) = \mathrm{SSD}(\mathrm{Mode} \mid QP) + \lambda \cdot R(\mathrm{Mode} \mid QP)$$

wherein λ is said Lagrangian multiplier for a mode decision and Mode indicates a macroblock mode that is 
examined using a Rate Distortion Optimization (RDO) process. 

34. The method as recited in Claim 32, wherein said Lagrangian multiplier is selected for Inter or Intra frames. 
35. The method as recited in Claim 33, wherein: 

$\lambda = f(QP) \times \lambda_{I,P}$, 

and 

f(QP) = rnaxl 2,min(4,^) I 
36. The method as recited in Claim 33, wherein: 

$\lambda = f(QP) \times \lambda_{I,P}$, 



and 
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f{QP) - max^min(4,^^) 
37. A method comprising: 



encoding at least a portion of video image data using encoder logic; and 

causing said encoder logic to output syntax information identifying a type of motion vector prediction employed 
by said encoder logic. 



38. The method as recited in Claim 37, wherein said encoder logic is configured to selectively employ at least one 
type of motion vector prediction selected from a group comprising spatial motion vector prediction and temporal 
motion vector prediction and wherein said encoder logic is configured to selectively output said syntax information 
and corresponding encoded video image data based at least in part on a Rate Distortion Optimization (RDO) 
decision. 

39. A method for use in conveying video encoding related information, the method comprising: 

encoding video data; and 

selectively setting at least one descriptor within a syntax portion of said encoded video data, said descriptor 
identifying an adaptive spatial/spatio-temporal encoding associated with at least one B frame encoded with 
said video data. 

40. The method as recited in Claim 39, wherein encoding said video data includes encoding said video data in accord 
with a JVT Standard, and said at least one descriptor within said syntax portion of said encoded video data includes 
a direct_mv_spatial parameter with a picture layer portion of said encoded video data. 



41. A method for use in conveying video encoding related information, the method comprising: 
encoding video data; and 

selectively setting at least one descriptor within a syntax portion of said encoded video data, said descriptor 
identifying an adaptive copy/motion-copy skip mode in at least one inter frame encoded with said video data. 

42. The method as recited in Claim 41, wherein encoding said video data includes encoding said video data in accord 
with a JVT standard, and said at least one descriptor within said syntax portion of said encoded video data includes 
a copy_mv_spatial parameter with a picture layer portion of said encoded video data. 

43. A method for use in a time stamp independent mode encoding of video considering stationary temporal/spatial 
portions of video frames, the method comprising: 

selectively applying spatial prediction of motion associated with at least one portion of a video frame in a video 
sequence; and 

if temporal motion prediction information for a reference portion of another video frame is zero, then setting 
said spatial prediction of motion to zero. 



44. The method as recited in Claim 43, wherein said spatial prediction is spatial zero biased. 

45. The method as recited in Claim 43, wherein said spatial prediction further considers stationary sub-partitions of 
other portions in said video frame, and wherein, if at least one of said sub-partitions has a corresponding motion 
that is approximately zero, then said other portion is also considered to have zero motion. 
46. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

processing at least a portion of data using a plurality of different quantization functions to produce a plurality 
of corresponding quantized portions of data; and 

selectively outputting one of said quantized portions of data based on at least one threshold value. 

47. The computer-readable medium as recited in Claim 46, further comprising: 

dividing initial data into a plurality of portions including said at least one portion of data; and 
classifying said at least one portion of data based on at least one classification characteristic. 

48. The computer-readable medium as recited in Claim 47, wherein said at least one threshold value is associated 
with said at least one classification characteristic. 

49. The computer-readable medium as recited in Claim 47, wherein said initial data includes data selected from a 
group comprising image data, video data, audio data, and speech data, and wherein said initial data includes 
image data or video data and said at least one portion is selected from a group comprising a block, a macroblock, 
a slit, a slice, and a section. 

50. The computer-readable medium as recited in Claim 47, wherein said at least one classification characteristic is 
selected from a group of characteristics comprising an edge characteristic, a texture characteristic, a smoothness 
characteristic, a luminance characteristic, a chrominance characteristic, a color characteristic, a noise 
characteristic, an object characteristic, a motion characteristic, a user preference characteristic, a user interface 
focus characteristic, a characteristic, a timing characteristic, a volume characteristic, a frequency characteristic, 
a pitch characteristic, a tone characteristic, a quality characteristic, a bit rate characteristic, a data type 
characteristic, a resolution characteristic, and an encryption characteristic. 

51. The computer-readable medium as recited in Claim 46, wherein said plurality of different quantization functions 
include at least two operatively different Deadzone Quantizers and wherein at least one of said Deadzone Quan- 
tizers includes an adaptive coverage Deadzone Quantizer. 



52. The computer-readable medium as recited in Claim 46, wherein selectively outputting said one of said quantized 
portions of data based on said at least one threshold value further includes performing Rate Distortion Optimization 
(RDO) to select said one of said quantized portions of data. 
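
As an illustration of the acts recited in Claims 46, 51 and 52, the following minimal Python sketch quantizes one block of coefficients with several deadzone settings and keeps the candidate with the lowest Lagrangian cost J = D + lambda * R. The function names, the rounding-offset model of the deadzone, and the bit-count estimate are assumptions made for this example only and are not taken from the specification.

```python
import math


def deadzone_quantize(coeffs, step, rounding_offset):
    """Uniform quantizer whose deadzone widens as rounding_offset shrinks.

    rounding_offset = 0.5 rounds to nearest; smaller values push more small
    coefficients to zero (assumed model, not the patent's definition).
    """
    levels = []
    for c in coeffs:
        level = int(math.floor(abs(c) / step + rounding_offset))
        levels.append(level if c >= 0 else -level)
    return levels


def rd_cost(coeffs, levels, step, lam):
    """Lagrangian cost J = D + lam * R for one candidate quantization."""
    # Distortion: squared error between input and reconstruction.
    distortion = sum((c - l * step) ** 2 for c, l in zip(coeffs, levels))
    # Rate: crude bit estimate per level (hypothetical, for illustration).
    rate = sum(1 + 2 * abs(l).bit_length() for l in levels)
    return distortion + lam * rate


def select_quantized(coeffs, step, offsets, lam):
    """Quantize with every offset and return the lowest-cost candidate."""
    candidates = [deadzone_quantize(coeffs, step, f) for f in offsets]
    return min(candidates, key=lambda lv: rd_cost(coeffs, lv, step, lam))
```

With `lam` set to zero the selection is driven by distortion alone, so the round-to-nearest candidate is kept; a larger `lam` favours the wider deadzone because it yields fewer non-zero levels.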

53. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 



performing at least one characteristic analysis on at least one portion of image data; 
selectively setting at least one adaptive quantization parameter within an encoder based on said at least one 
characteristic analysis; and 

encoding said at least one portion of image data with said encoder. 

54. The computer-readable medium as recited in Claim 53, wherein said at least one adaptive quantization parameter 
is associated with an adaptive coverage Deadzone Quantizer within said encoder. 

55. The computer-readable medium as recited in Claim 53, wherein selectively setting said at least one adaptive 
quantization parameter within said encoder based on said at least one characteristic analysis is performed 
on-the-fly for each of a plurality of portions of said image data which includes a video frame. 

56. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

causing at least one portion of image data to be encoded using at least two different Deadzone Quantizers; and 

identifying preferred encoded data in an output of one of said at least two different Deadzone Quantizers based 
on a Rate Distortion Optimization (RDO) decision associated with at least one decision factor. 



57. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

causing at least one portion of image data to be encoded using a first Deadzone Quantizer; 
determining if an output of said first Deadzone Quantizer satisfies at least one decision factor and 

if so, then outputting said output of said first Deadzone Quantizer; 

if not, then causing said at least one portion of image data to be encoded using at least a second 
Deadzone Quantizer that is different from said first Deadzone Quantizer. 
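
The sequential variant of Claim 57 can be sketched as follows. This is only an illustrative reading: the decision factor is assumed here to be a maximum allowed squared error, and the helper names and the rounding-offset deadzone model are hypothetical.

```python
def quantize(coeffs, step, offset):
    """Deadzone quantizer modelled by a rounding offset (assumed model)."""
    return [(1 if c >= 0 else -1) * int(abs(c) / step + offset) for c in coeffs]


def squared_error(coeffs, levels, step):
    """Distortion between the input and its reconstruction."""
    return sum((c - l * step) ** 2 for c, l in zip(coeffs, levels))


def encode_with_fallback(coeffs, step, first_offset, second_offset, max_error):
    """Try the first quantizer; fall back to the second if the test fails.

    Returns the chosen levels and a label naming which quantizer produced them.
    """
    first = quantize(coeffs, step, first_offset)
    # Decision factor (assumption): the first output is acceptable when its
    # distortion does not exceed max_error.
    if squared_error(coeffs, first, step) <= max_error:
        return first, "first"
    return quantize(coeffs, step, second_offset), "second"
```

Unlike the parallel selection of Claim 56, the second quantizer is only run when the first output fails the test, which saves work when the first choice is usually acceptable.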

58. The computer-readable medium as recited in Claim 57, further comprising: 

identifying an acceptable encoded version of said at least one portion of said image data based on a Rate 
Distortion Optimization (RDO) decision. 

59. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

performing image analysis on at least one portion of image data; 
performing block classification on said analyzed portion of image data; 
performing Deadzone Quantization of said block classified portion of image data; and 

performing encoding of said Deadzone Quantized portion of image data. 

60. The computer-readable medium as recited in Claim 59, wherein said image analysis includes at least one type of 
analysis selected from a group comprising edge detection analysis and texture analysis. 

61. The computer-readable medium as recited in Claim 59, wherein said block classification is operatively configured 
based on said image analysis and said Deadzone Quantization is operatively configured based on said block 
classification. 

62. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

causing at least one portion of video image data to be encoded using at least two different encoders wherein 
at least one of said two different encoders includes a Deadzone Quantizer operatively configured to support 
a Non Residual Mode of said video image data; and 

identifying preferred encoded frame data in an output of one of said at least two different encoders based on 
a Rate Distortion Optimization (RDO) decision associated with at least one decision factor. 

63. The computer-readable medium as recited in Claim 62, wherein said Deadzone Quantizer is adaptive and 
selectively causes said Non Residual Mode to be used. 

64. The computer-readable medium as recited in Claim 62, wherein at least one of said encoders is operatively 
configured to support a Direct Mode of said video image data. 

65. The computer-readable medium as recited in Claim 62, wherein at least one of said encoders is operatively 
configured to support a Copy Mode of said video image data. 

66. The computer-readable medium as recited in Claim 62, wherein said Non Residual Mode is operatively configured 
based on a Run Length encoding (RLC) strategy. 

67. The computer-readable medium as recited in Claim 62, wherein said encoder having said Deadzone Quantizer is 
selectively configured to remove at least one type of residue selected from a group of residue data comprising 
chrominance residue data, luminance residue data, and all residue data. 

68. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

selectively varying at least one Lagrangian multiplier that is operatively configuring encoding logic having a 
quantizing function based on at least one characteristic of at least one portion of image data; and 
encoding said at least one portion of said image data using said encoding logic. 
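
One way to picture the Lagrangian multiplier variation of Claim 68 is the small Python sketch below. It scales a base multiplier according to a block class and then picks the coding mode minimising J = D + lambda * R. The class names, the scale factors, and the candidate tuples are hypothetical values chosen for illustration and do not come from the specification.

```python
# Hypothetical per-class scale factors: a smaller multiplier spends more bits
# (protecting edges), a larger one saves bits (in smooth regions).
CLASS_SCALE = {"edge": 0.5, "texture": 1.0, "smooth": 1.5}


def adaptive_lambda(base_lambda, block_class):
    """Vary the Lagrangian multiplier according to a block characteristic."""
    return base_lambda * CLASS_SCALE[block_class]


def best_mode(candidates, lam):
    """Return the name of the mode with the lowest cost D + lam * R.

    candidates is a list of (name, distortion, rate_in_bits) tuples.
    """
    return min(candidates, key=lambda m: m[1] + lam * m[2])[0]
```

For the same pair of candidate modes, a block classified as an edge may therefore keep a more expensive, lower-distortion mode, while a smooth block falls back to the cheaper one.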

69. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

encoding at least a portion of video image data using encoder logic; and 

causing said encoder logic to output syntax information identifying a type of motion vector prediction employed 
by said encoder logic. 

70. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

encoding video data; and 

selectively setting at least one descriptor within a syntax portion of said encoded video data, said descriptor 
identifying an adaptive spatial/spatio-temporal encoding associated with at least one B frame encoded with 
said video data. 

71. The computer-readable medium as recited in Claim 70, wherein encoding said video data includes encoding said 
video data in accord with a JVT Standard, and said at least one descriptor within said syntax portion of said encoded 
video data includes a direct_mv_spatial parameter with a picture layer portion of said encoded video data. 

72. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

encoding video data; and 

selectively setting at least one descriptor within a syntax portion of said encoded video data, said descriptor 
identifying an adaptive copy/motion-copy skip mode in at least one inter frame encoded with said video data. 

73. The computer-readable medium as recited in Claim 41, wherein encoding said video data includes encoding said 
video data in accord with a JVT standard, and said at least one descriptor within said syntax portion of said encoded 
video data includes a copy_mv_spatial parameter with a picture layer portion of said encoded video data. 

74. A computer-readable medium having computer implementable instructions for configuring at least one processing 
unit to perform acts comprising: 

selectively applying spatial prediction of motion associated with at least one portion of a video frame in a video 
sequence; and 

if temporal motion prediction information for a reference portion of another video frame is zero, then setting 
said spatial prediction of motion to zero. 
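
The conditional zeroing recited in Claims 43 and 74 can be sketched in Python as below. The claim language only requires "spatial prediction"; the component-wise median of three neighbouring motion vectors used here is an assumption borrowed from common block-based codecs, and all names are hypothetical.

```python
def median_mv(a, b, c):
    """Component-wise median of three (x, y) motion vectors."""
    xs = sorted(v[0] for v in (a, b, c))
    ys = sorted(v[1] for v in (a, b, c))
    return (xs[1], ys[1])


def direct_mode_mv(left, up, up_right, colocated_mv):
    """Spatial motion prediction with a stationary temporal override.

    If the co-located block in the reference frame has zero motion, the
    block is treated as stationary and the prediction is forced to zero.
    """
    if colocated_mv == (0, 0):
        return (0, 0)
    return median_mv(left, up, up_right)
```

Because the decision uses only motion vectors and not frame timestamps, the prediction stays independent of time stamp information.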

75. The computer-readable medium as recited in Claim 43, wherein said spatial prediction is spatial zero biased. 

76. The computer-readable medium as recited in Claim 43, wherein said spatial prediction further considers stationary 
sub-partitions of other portions in said video frame, and wherein, if at least one of said sub-partitions has a 
corresponding motion that is approximately zero, then said other portion is also considered to have zero motion. 

77. An apparatus comprising logic operatively configured to process at least a portion of data using a plurality of 
different quantization functions to produce a plurality of corresponding quantized portions of data, and selectively 
output one of said quantized portions of data based on at least one threshold value. 

78. The apparatus as recited in Claim 77, wherein said logic is further configured to divide initial data into a plurality 
of portions including said at least one portion of data and classify said at least one portion of data based on at 
least one classification characteristic, and wherein said at least one threshold value is associated with said at least 
one classification characteristic. 

79. The apparatus as recited in Claim 77, wherein said plurality of different quantization functions include at least two 
operatively different Deadzone Quantizers and wherein at least one of said Deadzone Quantizers includes an 
adaptive coverage Deadzone Quantizer. 

80. The apparatus as recited in Claim 77, wherein said logic is further configured to perform Rate Distortion Optimi- 
zation (RDO) to select said one of said quantized portions of data. 

81. An apparatus comprising logic operatively configured to perform at least one characteristic analysis on at least 
one portion of image data, selectively establish at least one adaptive quantization parameter within an encoder 
based on said at least one characteristic analysis, and encode said at least one portion of image data with said 
encoder. 

82. The apparatus as recited in Claim 81, wherein said at least one adaptive quantization parameter is associated 
with an adaptive coverage Deadzone Quantizer within said encoder. 

83. An apparatus comprising logic operatively configured to cause at least one portion of image data to be encoded 
using at least two different Deadzone Quantizers, and identify preferred encoded data in an output of one of said 
at least two different Deadzone Quantizers based on a Rate Distortion Optimization (RDO) decision associated 
with at least one decision factor. 

84. An apparatus comprising logic operatively configured to cause at least one portion of image data to be encoded 
using a first Deadzone Quantizer, determine if an output of said first Deadzone Quantizer satisfies at least one 
decision factor, and, if so, then output said output of said first Deadzone Quantizer, else cause said at least one 
portion of image data to be encoded using at least a second Deadzone Quantizer that is different from said first 
Deadzone Quantizer. 

85. The apparatus as recited in Claim 84, wherein said logic is further configured to identify an acceptable encoded 
version of said at least one portion of said image data based on a Rate Distortion Optimization (RDO) decision. 

86. An apparatus comprising logic operatively configured to perform image analysis on at least one portion of image 
data, perform block classification on said analyzed portion of image data, perform Deadzone Quantization of said 
block classified portion of image data, and encode said Deadzone Quantized portion of image data. 

87. An apparatus comprising logic operatively configured to cause at least one portion of video image data to be 
encoded using at least two different encoders wherein at least one of said two different encoders includes a Dead- 
zone Quantizer operatively configured to support a Non Residual Mode of said video image data, and identify 
preferred encoded frame data in an output of one of said at least two different encoders based on a Rate Distortion 
Optimization (RDO) decision associated with at least one decision factor. 

88. The apparatus as recited in Claim 87, wherein said Deadzone Quantizer is adaptive and selectively causes said 
Non Residual Mode to be used. 

89. The apparatus as recited in Claim 87, wherein at least one of said encoders is operatively configured to support 
at least one mode selected from a group comprising a Direct Mode of said video image data and a Copy Mode of 
said video image data. 

90. An apparatus comprising logic operatively configured to selectively vary at least one Lagrangian multiplier that is 
operatively configuring encoding logic having a quantizing function based on at least one characteristic of at least 
one portion of image data, and encode said at least one portion of said image data using said encoding logic. 

91. An apparatus comprising logic operatively configured to encode at least a portion of video image data using encoder 
logic, and cause said encoder logic to output syntax information identifying a type of motion vector prediction 
employed by said encoder logic. 

92. An apparatus comprising logic operatively configured to encode video data, and selectively set at least one 
descriptor within a syntax portion of said encoded video data, said descriptor identifying an adaptive 
spatial/spatio-temporal encoding associated with at least one B frame encoded with said video data. 

93. An apparatus comprising logic operatively configured to encode video data, and selectively set at least one 
descriptor within a syntax portion of said encoded video data, said descriptor identifying an adaptive 
copy/motion-copy skip mode in at least one inter frame encoded with said video data. 

94. An apparatus comprising logic operatively configured to selectively apply spatial prediction of motion associated 
with at least one portion of a video frame in a video sequence, and if temporal motion prediction information for a 
reference portion of another video frame is zero then setting said spatial prediction of motion to zero. 

The present European patent application comprised at the time of filing more than ten claims. 

Only part of the claims have been paid within the prescribed time limit. The present European search 
report has been drawn up for the first ten claims and for those claims for which claims fees have 
been paid, namely claim(s): 



□ 



No claims fees have been paid within the prescribed time limit. The present European search report has 
been drawn up for the first ten claims. 



LACK OF UNITY OF INVENTION 



The Search Division considers that the present European patent application does not comply with the 
requirements of unity of invention and relates to several inventions or groups of inventions, namely: 

□ All further search fees have been paid within the fixed time limit. The present European search report has 
been drawn up for all claims. 



As all searchable claims could be searched without effort justifying an additional fee, the Search Division 
did not invite payment of any additional fee. 

Only part of the further search fees have been paid within the fixed time limit. The present European 
search report has been drawn up for those parts of the European patent application which relate to the 
inventions in respect of which search fees have been paid, namely claims: 

None of the further search fees have been paid within the fixed time limit. The present European search 
report has been drawn up for those parts of the European patent application which relate to the invention 

The Search Division considers that the present European patent application does not comply with the 
requirements of unity of invention and relates to several inventions or groups of inventions, namely: 

1. claims: 1-11, 16-19, 46-52, 56-58, 77-80, 83-85 

relate to a method and apparatus for: 

a) processing at least a portion of data using a plurality 
of quantization functions, and 

b) outputting one of said quantized portions of data based 
on at least one threshold value. 



2. claims: 12-15, 53-55, 81, 82 



relate to a method and apparatus for: 

a) analysing image characteristics, 

b) selecting adaptive quantization, and 

c) encoding of the image. 



3. claims: 19-21, 59-61, 86 



relate to a method and apparatus for: 

a) analysing image characteristics, 

b) including Deadzone quantization, 

c) performing block classification based on image analysis, 
and 

d) encoding of the image. 



4. claims: 22-31, 62-67, 87-89 



relate to a method and apparatus for: 

a) encoding at least a portion of video image using at least 
two different encoders, 

b) where at least one encoder includes a Deadzone quantizer 
configured to support a Non Residual Mode, and 

c) identifying preferred encoded data of the output of said 
at least two encoders based on a Rate Distortion 
Optimization. 



5. claims: 32-36, 68, 90 



relate to a method and apparatus for: 

a) varying at least one Lagrangian multiplier, 

b) where Lagrangian multiplier is operatively configuring 
encoding logic, 

c) encoding logic has a quantization function, and 

d) encoding said image data using said encoding logic. 



6. claims: 37-42, 69-73, 91-93 

relate to a method and apparatus for: 

a) encoding video data, and 

b) encoder selectively outputs syntax information 
identifying a type of motion vector prediction. 



7. claims: 43-45, 74-76, 94 



relate to a method and apparatus for: 

a) encoding video considering temporal or spatial stationary 
portions of video frames, 

b) selectively applying spatial prediction of motion, and 

c) conditionally setting spatial prediction of motion to 
zero. 